Improving Imbalanced data classification accuracy by using Fuzzy Similarity Measure and subtractive clustering

Authors

Abstract:

 Classification is one of the important parts of data mining and knowledge discovery. In most cases, the data used to train the classifier is not well distributed. This imbalance occurs when one class has a large number of samples while the number of samples in the other class is inherently low. In general, methods for solving this kind of problem are divided into two categories: under-sampling and over-sampling. In this paper, an under-sampling method using subtractive clustering and a fuzzy similarity measure is presented, and its performance is analyzed in terms of efficiency in classifying imbalanced data. For this purpose, subtractive clustering is first applied to cluster the majority class data. Then, using the fuzzy similarity measure, the samples of each cluster are ranked, and appropriate samples are selected based on these rankings. The selected samples, together with the minority class, constitute the final dataset. In this research, MATLAB is used for implementation, the results are evaluated using the AUC criterion, and the results are analyzed using standard statistical methods. The experimental results show the effectiveness of the proposed method compared with other under-sampling methods.
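The abstract does not give the exact formulas, so the following is only a minimal sketch of the described pipeline, written in Python with NumPy rather than the MATLAB used in the paper. Several points are assumptions, not the authors' method: the potential function follows the commonly cited form of subtractive clustering with a single simplified stopping threshold, the fuzzy similarity is taken to be a Gaussian membership of each sample to its cluster center, and each cluster keeps a share of samples proportional to the minority-to-majority ratio. The names `subtractive_clustering`, `undersample_majority`, `ra`, and `stop_ratio` are hypothetical.

```python
import numpy as np


def subtractive_clustering(X, ra=0.5, stop_ratio=0.15):
    """Simplified subtractive clustering (assumed form, not the paper's exact one).

    Returns the row indices chosen as cluster centers and the squared
    pairwise distances computed on features scaled to [0, 1].
    """
    X = np.asarray(X, dtype=float)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0                      # avoid division by zero
    Z = (X - X.min(axis=0)) / span
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    rb = 1.5 * ra                              # wider radius for the penalty
    potential = np.exp(-4.0 * d2 / ra**2).sum(axis=1)
    first_peak = potential.max()
    centers = []
    for _ in range(len(X)):
        k = int(np.argmax(potential))
        peak = potential[k]
        if peak < stop_ratio * first_peak:     # remaining peaks are too weak
            break
        centers.append(k)
        # Reduce the potential of every point near the new center.
        potential = potential - peak * np.exp(-4.0 * d2[k] / rb**2)
    return np.array(centers), d2


def undersample_majority(X_maj, X_min, ra=0.5):
    """Cluster the majority class, rank members by similarity, keep the top ones."""
    X_maj = np.asarray(X_maj, dtype=float)
    X_min = np.asarray(X_min, dtype=float)
    centers, d2 = subtractive_clustering(X_maj, ra=ra)
    labels = np.argmin(d2[:, centers], axis=1)  # nearest center per sample
    n_maj, n_min = len(X_maj), len(X_min)
    keep = []
    for j, c in enumerate(centers):
        members = np.flatnonzero(labels == j)
        if members.size == 0:
            continue
        # Assumed fuzzy similarity: Gaussian membership to the cluster center.
        sim = np.exp(-4.0 * d2[members, c] / ra**2)
        ranked = members[np.argsort(-sim, kind="stable")]
        # Ceiling of (cluster size * minority / majority), at least one sample.
        quota = max(1, -(-members.size * n_min // n_maj))
        keep.extend(ranked[: min(quota, members.size)].tolist())
    keep = np.array(sorted(keep))
    X_bal = np.vstack([X_maj[keep], X_min])
    y_bal = np.concatenate([np.zeros(len(keep), dtype=int),
                            np.ones(n_min, dtype=int)])
    return X_bal, y_bal, keep, centers
```

Calling `undersample_majority(X_maj, X_min)` returns the reduced dataset with label 0 for retained majority samples and label 1 for minority samples. A classifier trained on this output could then be scored with AUC, as the abstract describes.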

Similar resources

On Mining Fuzzy Classification Rules for Imbalanced Data

Fuzzy rule-based classification system (FRBCS) is a popular machine learning technique for classification purposes. One of the major issues when applying it to imbalanced data sets is its bias toward the majority class, such that it performs poorly with respect to the minority class. However, in many cases the minority classes are more important than the majority ones. In this paper, we have extended ...

The Clustering and Classification Data Mining Techniques in Insurance Fraud Detection: The Case of Iranian Car Insurance

Given the ever-increasing spread of fraud in the insurance sector, especially in automobile insurance, and its negative consequences for insurance companies, employing appropriate and efficient methods to identify and detect fraud in this area is essential. Understanding the patterns in data on previously reported claims can help determine whether a damage claim is genuine or not. One of the most common and widely used ways of discovering patterns in data is the use of ...

Hysteresis Modeling using Fuzzy Subtractive Clustering

This paper summarizes work undertaken in the area of modeling Shape Memory Alloy (SMA) and airfoil hysteresis using a Sugeno-type fuzzy modeling approach based on subtractive clustering. Two alternative approaches to develop a fuzzy model for hysteresis are proposed and evaluated. The first consists in building a mirror image of the lower curve in order to model both curves concurrently and the...

Journal title

volume 19, issue 2

pages 27-38

publication date 2022-09
